Two High-Performance Alternatives to ZLIB Scientific-Data Compression
Authors
Abstract
ZLIB is used in diverse frameworks by the scientific community, both to reduce disk storage and to alleviate pressure on I/O. As it becomes a bottleneck on multi-core systems, higher-throughput alternatives must be considered, exploiting parallelism and/or more effective compression schemes. This work provides a comparative study of the ZLIB, LZ4 and FPC compressors (serial and parallel implementations), focusing on compression ratio (CR), bandwidth and speedup. LZ4 provides very high throughput (decompressing at over 1 GB/s versus 120 MB/s for ZLIB), but its CR degrades by 5-10%. FPC also provides higher throughput than ZLIB, but its CR varies widely with the data. ZLIB and LZ4 can achieve almost linear speedups for some datasets, while the current implementation of parallel FPC provides little, if any, performance gain. For the ROOT dataset, LZ4 was found to provide higher CR, better scalability and lower memory consumption than FPC, thus emerging as the better alternative to ZLIB.
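The two metrics the abstract relies on, CR and bandwidth, can be illustrated with a minimal Python sketch. It is not the benchmark harness used in the study: it uses only the standard-library `zlib` module, the helper names `measure`, `compress_chunks` and `decompress_chunks` are hypothetical, and it takes CR to mean original size divided by compressed size. The chunked variant shows one common way to parallelise a block compressor, assuming independent chunks are acceptable and that CPython's `zlib` releases the interpreter lock while compressing, so threads can overlap.

```python
import time
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def measure(data: bytes, level: int = 6) -> Dict[str, float]:
    """Compress and decompress `data` once; report CR and bandwidth in MB/s."""
    t0 = time.perf_counter()
    packed = zlib.compress(data, level)
    t1 = time.perf_counter()
    restored = zlib.decompress(packed)
    t2 = time.perf_counter()
    assert restored == data  # the round trip must be lossless

    megabytes = len(data) / 1e6
    return {
        # Assumed definition: CR = uncompressed size / compressed size.
        "cr": len(data) / len(packed),
        # Bandwidth is measured against the uncompressed size in both directions.
        "compress_mb_s": megabytes / max(t1 - t0, 1e-9),
        "decompress_mb_s": megabytes / max(t2 - t1, 1e-9),
    }


def compress_chunks(data: bytes, chunk_size: int, workers: int) -> List[bytes]:
    """Split `data` into fixed-size chunks and compress each one independently."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so the chunks can be rejoined later.
        return list(pool.map(zlib.compress, chunks))


def decompress_chunks(parts: List[bytes]) -> bytes:
    """Inverse of compress_chunks: decompress each chunk and concatenate."""
    return b"".join(zlib.decompress(p) for p in parts)


if __name__ == "__main__":
    sample = bytes(range(256)) * 4096  # 1 MiB of highly repetitive data
    print(measure(sample))
    parts = compress_chunks(sample, chunk_size=64 * 1024, workers=4)
    print(len(parts), "chunks,", sum(len(p) for p in parts), "bytes compressed")
```

Speedup in this setting would be the serial compression time divided by the chunked, multi-worker time. Compressing chunks independently usually lowers CR somewhat, because each chunk starts with an empty dictionary, so chunk size is a trade-off between parallelism and ratio.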
Similar resources
Improving the I/O Throughput for Data-Intensive Scientific Applications with Efficient Compression Mechanisms
Today’s science is generating significantly larger volumes of data than before. Data compression can potentially improve application performance. However, in many scientific applications, and especially in large-scale parallel scientific applications, each process often accesses only parts of the data. This can result in some data being decompressed by a process but never used. General compressi...
330343-002_High Performance ZLIB Compression on Intel® Architecture Processors White Paper
The need for lossless data compression has grown significantly as the amount of data collected, transmitted, and stored has exploded in recent years. Enterprise applications and storage, such as web servers and databases, are processing this data and the computational burden associated with compression puts a strain on resources. To help alleviate the burden, we introduce an optimized industry ...
High performance combinatorial algorithm design on the Cell Broadband Engine processor
The Sony–Toshiba–IBM Cell Broadband Engine (Cell/B.E.) is a heterogeneous multicore architecture that consists of a traditional microprocessor (PPE) with eight SIMD co-processing units (SPEs) integrated on-chip. While the Cell/B.E. processor is architected for multimedia applications with regular processing requirements, we are interested in its performance on problems with non-uniform memory a...
High Performance DEFLATE Compression on Intel® Architecture Processors white paper
There is a critical need for lossless data compression in enterprise storage and applications such as databases and web servers, which process huge amounts of data. DEFLATE is a widely used standard to perform lossless compression, and forms the basis of utilities such as gzip and libraries such as Zlib. In these applications, compression imposes a large computational burden on the servers, and...
ZLIB Compressed Data Format Specification
This specification defines a lossless compressed data format. The data can be produced or consumed, even for an arbitrarily long sequentially presented input data stream, using only an a priori bounded amount of intermediate storage. The format presently uses the DEFLATE compression method but can be easily extended to use other compression methods. It can be implemented readily in a manner not...
Publication date: 2014